A Permutation-based Model for Crowd Labeling: Optimal Estimation and Robustness

Authors

  • Nihar B. Shah
  • Sivaraman Balakrishnan
  • Martin J. Wainwright
Abstract

The aggregation and denoising of crowd-labeled data is a task that has gained increased significance with the advent of crowdsourcing platforms and massive datasets. In this paper, we propose a permutation-based model for crowd-labeled data that is a significant generalization of the common Dawid-Skene model, and introduce a new error metric by which to compare different estimators. Working in a high-dimensional non-asymptotic framework that allows both the number of workers and tasks to scale, we derive optimal rates of convergence for the permutation-based model. We show that the permutation-based model offers significant robustness in estimation due to its richness, while surprisingly incurring only a small additional statistical penalty as compared to the Dawid-Skene model. Finally, we propose a computationally efficient method, called the OBI-WAN estimator, that is uniformly optimal over a class intermediate between the permutation-based and the Dawid-Skene models, and is uniformly consistent over the entire permutation-based model class. In contrast, the guarantees for estimators available in prior literature are sub-optimal over the original Dawid-Skene model.
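
To make the setting concrete, the following is a minimal sketch of crowd labeling under the single-parameter ("one-coin") form of the Dawid-Skene model, in which worker i answers every binary task correctly with probability q_i regardless of the task. The function names, the ±1 label encoding, and the majority-vote baseline are illustrative assumptions and are not taken from the abstract; in particular, this is not the OBI-WAN estimator. The permutation-based model is understood to relax this setup by letting the probability of a correct answer also depend on the task.

```python
import random


def simulate_one_coin(truth, worker_accuracy, rng):
    """Simulate labels under an assumed one-coin Dawid-Skene model.

    truth: list of true labels, each +1 or -1.
    worker_accuracy: list of q_i, the probability that worker i is correct.
    rng: a random.Random instance, passed in for reproducibility.
    Returns one row of labels per worker.
    """
    labels = []
    for q in worker_accuracy:
        # Worker reports the true label with probability q, else flips it.
        row = [t if rng.random() < q else -t for t in truth]
        labels.append(row)
    return labels


def majority_vote(labels):
    """Baseline aggregator: sign of the column sums (ties resolved to +1)."""
    n_tasks = len(labels[0])
    estimates = []
    for j in range(n_tasks):
        total = sum(row[j] for row in labels)
        estimates.append(1 if total >= 0 else -1)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [rng.choice([-1, 1]) for _ in range(20)]
    # Hypothetical accuracies: two reliable workers, three near-random ones.
    accuracies = [0.9, 0.85, 0.55, 0.55, 0.5]
    labels = simulate_one_coin(truth, accuracies, rng)
    estimate = majority_vote(labels)
    errors = sum(e != t for e, t in zip(estimate, truth))
    print("majority-vote errors:", errors, "of", len(truth))
```

Majority vote weights all workers equally, which is why estimators that account for differing worker reliability are of interest in this line of work.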

Similar resources

MILP Formulation and Genetic Algorithm for Non-permutation Flow Shop Scheduling Problem with Availability Constraints

In this paper, we consider a flow shop scheduling problem with availability constraints (FSSPAC) with the objective of minimizing the makespan. In such a problem, machines are not continuously available for processing jobs due to preventive maintenance activities. We propose a mixed-integer linear programming (MILP) model for this problem which can generate non-permutation schedules. Furthermor...

Analysis of Response Robustness for a Multi-Objective Mathematical Model of Dynamic Cellular Manufacturing

The main purpose of a multi-objective optimization problem is to generate an optimal set of targets, known as the Pareto optimal frontier, to be provided to the ultimate decision-makers. The final selection of a point on the Pareto frontier is usually made by the decision-makers based only on the goals presented in the mathematical model of the considered system. In this paper, a mathematical model...

A HYBRID SUPPORT VECTOR REGRESSION WITH ANT COLONY OPTIMIZATION ALGORITHM IN ESTIMATION OF SAFETY FACTOR FOR CIRCULAR FAILURE SLOPE

Slope stability is one of the most complex and essential issues for civil and geotechnical engineers, mainly because of the loss of life and the high economic losses resulting from slope failures. In this paper, a new approach is presented for estimating the Safety Factor (SF) for circular failure slope using hybrid support vector regression (SVR) and Ant Colony Optimization (ACO). The ACO is combined with the S...

Observability-Enhanced PMU Placement Considering Conventional Measurements and Contingencies

Phasor Measurement Units (PMUs) are receiving growing attention in recent power systems because of their paramount abilities in state estimation. PMUs are placed in existing power systems where conventional measurements are already installed, and these can be helpful if they are considered in optimal PMU placement. In this paper, a method is proposed for optimal placement of PMUs incorporating conve...

Crowd density estimation based on statistical analysis of local intra-crowd motions for public area surveillance

Crowd density estimation in public areas with people gathering and waiting has been a challenging problem for visual surveillance over many years. Tiny motions, like when people turn around, wander about, and turn their heads, happen randomly now and then in crowds, which makes it difficult to achieve high-performance crowd density estimation based on traditional foreground detection. A novel a...

Journal:
  • CoRR

Volume abs/1606.09632

Publication date: 2016